Overview

Dataset statistics

Number of variables: 27
Number of observations: 2219
Missing cells: 3660
Missing cells (%): 6.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.6 MiB
Average record size in memory: 741.0 B
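
The overview figures can be cross-checked with a little arithmetic. The sketch below is illustrative only: it hard-codes numbers copied from this report (the per-column missing counts are those reported for Cidade and UF) and assumes that "Missing cells (%)" is missing cells divided by variables times observations, and that 1 MiB is 2**20 bytes.

```python
# Figures copied from the report; nothing is read from disk.
n_variables = 27
n_observations = 2219
missing_by_column = {"Cidade": 1931, "UF": 1729}

total_cells = n_variables * n_observations
missing_cells = sum(missing_by_column.values())
missing_pct = 100 * missing_cells / total_cells

# Assumption: total size is roughly average record size times row count.
avg_record_bytes = 741.0
total_mib = avg_record_bytes * n_observations / 2**20

print(missing_cells)           # 3660
print(round(missing_pct, 1))   # 6.1
print(round(total_mib, 1))     # 1.6
```

Under these assumptions the two columns with missing values account for every missing cell in the dataset.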

Variable types

NUM: 17
CAT: 10

Reproduction

Analysis started: 2020-02-29 23:40:29.278805
Analysis finished: 2020-02-29 23:41:19.748086
Version: pandas-profiling v2.5.0
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Download configuration: config.yaml

Warnings

Data Sorteio has a high cardinality: 2219 distinct values (High cardinality)
Arrecadacao_Total has a high cardinality: 1150 distinct values (High cardinality)
Cidade has a high cardinality: 168 distinct values (High cardinality)
Rateio_Sena has a high cardinality: 514 distinct values (High cardinality)
Rateio_Quina has a high cardinality: 2219 distinct values (High cardinality)
Rateio_Quadra has a high cardinality: 2174 distinct values (High cardinality)
Valor_Acumulado has a high cardinality: 1707 distinct values (High cardinality)
Acumulado_Mega_da_Virada has a high cardinality: 1217 distinct values (High cardinality)
Ganhadores_Quadra is highly correlated with Ganhadores_Quina (High correlation)
Ganhadores_Quina is highly correlated with Ganhadores_Quadra (High correlation)
D1 is highly correlated with 1 Dezena (High correlation)
1 Dezena is highly correlated with D1 (High correlation)
D2 is highly correlated with 2 Dezena (High correlation)
2 Dezena is highly correlated with D2 (High correlation)
D3 is highly correlated with 3 Dezena (High correlation)
3 Dezena is highly correlated with D3 (High correlation)
D4 is highly correlated with 4 Dezena (High correlation)
4 Dezena is highly correlated with D4 (High correlation)
D5 is highly correlated with 5 Dezena (High correlation)
5 Dezena is highly correlated with D5 (High correlation)
D6 is highly correlated with 6 Dezena (High correlation)
6 Dezena is highly correlated with D6 (High correlation)
Cidade has 1931 (87.0%) missing values (Missing)
UF has 1729 (77.9%) missing values (Missing)
Ganhadores_Sena is highly skewed (γ1 = 25.32134224) (Skewed)
Ganhadores_Sena has 1706 (76.9%) zeros (Zeros)
Estimativa_Pr�mio has 860 (38.8%) zeros (Zeros)
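
The high-correlation warnings pair each "n Dezena" column with a "Dn" column, and the per-variable statistics reported for each pair are identical, which suggests (but does not prove) that D1 to D6 are copies of the Dezena columns. A minimal sketch of how one might check this, assuming the table is held as a dict of column lists; `duplicate_column_pairs` and the toy values are hypothetical and are not taken from the profiled file:

```python
from itertools import combinations


def duplicate_column_pairs(columns):
    """Return pairs of column names whose values are identical row by row."""
    return [
        (a, b)
        for a, b in combinations(columns, 2)
        if columns[a] == columns[b]
    ]


# Toy data for illustration only.
toy = {
    "1 Dezena": [41, 9, 36],
    "2 Dezena": [5, 39, 30],
    "D1": [41, 9, 36],
    "D2": [5, 39, 30],
}

pairs = duplicate_column_pairs(toy)
print(pairs)  # [('1 Dezena', 'D1'), ('2 Dezena', 'D2')]
```

If the pairs do turn out to be exact copies, dropping one column of each pair before modelling would remove these warnings without losing information.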

Variables

Concurso
Real number (ℝ≥0)

UNIFORM
UNIQUE
Distinct count: 2219
Unique (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1110
Minimum: 1
Maximum: 2219
Zeros: 0
Zeros (%): 0.0%
Memory size: 17.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 111.9
Q1: 555.5
Median: 1110
Q3: 1664.5
95-th percentile: 2108.1
Maximum: 2219
Range: 2218
Interquartile range (IQR): 1109

Descriptive statistics

Standard deviation: 640.714445
Coefficient of variation (CV): 0.5772202207
Kurtosis: -1.2
Mean: 1110
Median Absolute Deviation (MAD): 554.7498873
Skewness: 0
Sum: 2463090
Variance: 410515
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[1.000e+00 2.219e+03], "bayesian blocks" binning strategy used)
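
The Concurso statistics are what one would expect if the column is simply the draw number running from 1 to 2219. The sketch below rebuilds the figures under that assumption (the sequence itself is an assumption, not read from the file). It uses the sample variance (dividing by n - 1) and linear interpolation for quantiles. Note that the reported MAD value is reproduced by the mean absolute deviation around the mean, not by the median absolute deviation.

```python
import math
import statistics

# Assumption: Concurso is exactly the sequence 1, 2, ..., 2219.
n = 2219
values = list(range(1, n + 1))

total = sum(values)
mean = total / n
variance = statistics.variance(values)  # sample variance, divides by n - 1
std = math.sqrt(variance)
cv = std / mean
# Mean absolute deviation around the mean.
mean_abs_dev = sum(abs(v - mean) for v in values) / n


def quantile(sorted_values, q):
    """Quantile by linear interpolation between neighbouring order statistics."""
    pos = q * (len(sorted_values) - 1)
    lower = math.floor(pos)
    upper = math.ceil(pos)
    weight = pos - lower
    return sorted_values[lower] + weight * (sorted_values[upper] - sorted_values[lower])


quantiles = [round(quantile(values, q), 1) for q in (0.05, 0.25, 0.5, 0.75, 0.95)]

print(total, mean, variance)
print(round(std, 6), round(cv, 6), round(mean_abs_dev, 6))
print(quantiles)
```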
ValueCountFrequency (%) 
2047 1 < 0.1%
 
1304 1 < 0.1%
 
1316 1 < 0.1%
 
1314 1 < 0.1%
 
1312 1 < 0.1%
 
1310 1 < 0.1%
 
1308 1 < 0.1%
 
1306 1 < 0.1%
 
1302 1 < 0.1%
 
1286 1 < 0.1%
 
Other values (2209) 2209 99.5%
 
ValueCountFrequency (%) 
1 1 < 0.1%
 
2 1 < 0.1%
 
3 1 < 0.1%
 
4 1 < 0.1%
 
5 1 < 0.1%
 
ValueCountFrequency (%) 
2219 1 < 0.1%
 
2218 1 < 0.1%
 
2217 1 < 0.1%
 
2216 1 < 0.1%
 
2215 1 < 0.1%
 

Data Sorteio
Categorical

HIGH CARDINALITY
UNIFORM
UNIQUE
Distinct count: 2219
Unique (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 17.5 KiB

Most frequent values (value: count)
10/01/2009: 1
13/11/1999: 1
05/07/2008: 1
05/03/2016: 1
27/03/2002: 1
Other values (2214): 2214
ValueCountFrequency (%) 
10/01/2009 1 < 0.1%
 
13/11/1999 1 < 0.1%
 
05/07/2008 1 < 0.1%
 
05/03/2016 1 < 0.1%
 
27/03/2002 1 < 0.1%
 
03/10/2012 1 < 0.1%
 
21/05/2016 1 < 0.1%
 
17/10/2009 1 < 0.1%
 
28/11/2015 1 < 0.1%
 
03/10/2009 1 < 0.1%
 
Other values (2209) 2209 99.5%
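
Data Sorteio is profiled as a categorical variable because the dates are stored as text. A minimal sketch of converting them, assuming the format is day/month/year (values such as 13/11/1999 and 27/03/2002 only make sense day-first); `parse_draw_date` is a hypothetical helper name:

```python
from datetime import date, datetime


def parse_draw_date(text: str) -> date:
    """Parse a dd/mm/yyyy string into a date object."""
    return datetime.strptime(text, "%d/%m/%Y").date()


samples = ["10/01/2009", "13/11/1999", "05/07/2008"]
parsed = [parse_draw_date(s) for s in samples]

print(parsed[0].isoformat())  # 2009-01-10
print(min(parsed).isoformat())  # 1999-11-13
```

Once parsed, the column can be sorted and differenced like any other date, which a string column cannot do reliably.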
 

Length

Max length: 10
Mean length: 10
Min length: 10
ValueCountFrequency (%) 
Decimal_Number 10 90.9%
 
Other_Punctuation 1 9.1%
 
ValueCountFrequency (%) 
Common 11 100.0%
 
ValueCountFrequency (%) 
ASCII 11 100.0%
 

1 Dezena
Real number (ℝ≥0)

HIGH CORRELATION
Distinct count: 60
Unique (%): 2.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 30.56106354
Minimum: 1
Maximum: 60
Zeros: 0
Zeros (%): 0.0%
Memory size: 17.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 4
Q1: 16
Median: 31
Q3: 46
95-th percentile: 57
Maximum: 60
Range: 59
Interquartile range (IQR): 30

Descriptive statistics

Standard deviation: 17.31535952
Coefficient of variation (CV): 0.5665823605
Kurtosis: -1.197267312
Mean: 30.56106354
Median Absolute Deviation (MAD): 14.94274467
Skewness: -0.02374603654
Sum: 67815
Variance: 299.8216753
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 1. 1.5 59.5 60. ], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
28 53 2.4%
 
4 49 2.2%
 
49 47 2.1%
 
30 46 2.1%
 
47 46 2.1%
 
59 44 2.0%
 
35 43 1.9%
 
32 43 1.9%
 
44 43 1.9%
 
27 43 1.9%
 
Other values (50) 1762 79.4%
 
ValueCountFrequency (%) 
1 34 1.5%
 
2 43 1.9%
 
3 27 1.2%
 
4 49 2.2%
 
5 38 1.7%
 
ValueCountFrequency (%) 
60 36 1.6%
 
59 44 2.0%
 
58 26 1.2%
 
57 31 1.4%
 
56 39 1.8%
 

2 Dezena
Real number (ℝ≥0)

HIGH CORRELATION
Distinct count: 60
Unique (%): 2.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 30.08292023
Minimum: 1
Maximum: 60
Zeros: 0
Zeros (%): 0.0%
Memory size: 17.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 3
Q1: 15
Median: 30
Q3: 45
95-th percentile: 57
Maximum: 60
Range: 59
Interquartile range (IQR): 30

Descriptive statistics

Standard deviation: 17.40237153
Coefficient of variation (CV): 0.5784801276
Kurtosis: -1.237111547
Mean: 30.08292023
Median Absolute Deviation (MAD): 15.15949172
Skewness: 0.002090724933
Sum: 66754
Variance: 302.842535
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 1. 1.5 60. ], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
5 55 2.5%
 
32 49 2.2%
 
17 48 2.2%
 
21 47 2.1%
 
53 47 2.1%
 
10 47 2.1%
 
42 46 2.1%
 
8 45 2.0%
 
33 45 2.0%
 
11 44 2.0%
 
Other values (50) 1746 78.7%
 
ValueCountFrequency (%) 
1 40 1.8%
 
2 37 1.7%
 
3 41 1.8%
 
4 31 1.4%
 
5 55 2.5%
 
ValueCountFrequency (%) 
60 25 1.1%
 
59 35 1.6%
 
58 37 1.7%
 
57 30 1.4%
 
56 42 1.9%
 

3 Dezena
Real number (ℝ≥0)

HIGH CORRELATION
Distinct count: 60
Unique (%): 2.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 30.79134745
Minimum: 1
Maximum: 60
Zeros: 0
Zeros (%): 0.0%
Memory size: 17.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 4
Q1: 16
Median: 31
Q3: 46
95-th percentile: 58
Maximum: 60
Range: 59
Interquartile range (IQR): 30

Descriptive statistics

Standard deviation: 17.36442257
Coefficient of variation (CV): 0.5639383791
Kurtosis: -1.189359197
Mean: 30.79134745
Median Absolute Deviation (MAD): 14.99273451
Skewness: -0.0105751407
Sum: 68326
Variance: 301.5231713
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 1. 1.5 60. ], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
18 52 2.3%
 
27 52 2.3%
 
54 48 2.2%
 
58 47 2.1%
 
56 47 2.1%
 
4 46 2.1%
 
24 46 2.1%
 
29 45 2.0%
 
37 44 2.0%
 
34 44 2.0%
 
Other values (50) 1748 78.8%
 
ValueCountFrequency (%) 
1 41 1.8%
 
2 38 1.7%
 
3 31 1.4%
 
4 46 2.1%
 
5 32 1.4%
 
ValueCountFrequency (%) 
60 31 1.4%
 
59 41 1.8%
 
58 47 2.1%
 
57 36 1.6%
 
56 47 2.1%
 

4 Dezena
Real number (ℝ≥0)

HIGH CORRELATION
Distinct count: 60
Unique (%): 2.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 30.81478143
Minimum: 1
Maximum: 60
Zeros: 0
Zeros (%): 0.0%
Memory size: 17.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 4
Q1: 16
Median: 31
Q3: 45
95-th percentile: 58
Maximum: 60
Range: 59
Interquartile range (IQR): 29

Descriptive statistics

Standard deviation: 17.27227036
Coefficient of variation (CV): 0.5605189963
Kurtosis: -1.182374478
Mean: 30.81478143
Median Absolute Deviation (MAD): 14.9065153
Skewness: -0.02897220929
Sum: 68378
Variance: 298.3313233
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 1. 1.5 59.5 60. ], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
37 54 2.4%
 
36 48 2.2%
 
54 46 2.1%
 
29 45 2.0%
 
60 45 2.0%
 
18 44 2.0%
 
1 44 2.0%
 
43 44 2.0%
 
53 43 1.9%
 
5 43 1.9%
 
Other values (50) 1763 79.5%
 
ValueCountFrequency (%) 
1 44 2.0%
 
2 32 1.4%
 
3 24 1.1%
 
4 35 1.6%
 
5 43 1.9%
 
ValueCountFrequency (%) 
60 45 2.0%
 
59 34 1.5%
 
58 40 1.8%
 
57 33 1.5%
 
56 31 1.4%
 

5 Dezena
Real number (ℝ≥0)

HIGH CORRELATION
Distinct count: 60
Unique (%): 2.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 30.42361424
Minimum: 1
Maximum: 60
Zeros: 0
Zeros (%): 0.0%
Memory size: 17.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 4
Q1: 15
Median: 31
Q3: 45
95-th percentile: 57
Maximum: 60
Range: 59
Interquartile range (IQR): 30

Descriptive statistics

Standard deviation: 17.18679273
Coefficient of variation (CV): 0.564916206
Kurtosis: -1.198977134
Mean: 30.42361424
Median Absolute Deviation (MAD): 14.89063825
Skewness: -0.01804537533
Sum: 67510
Variance: 295.3858443
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 1. 59.5 60. ], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
35 48 2.2%
 
28 47 2.1%
 
16 46 2.1%
 
44 45 2.0%
 
10 45 2.0%
 
45 45 2.0%
 
12 45 2.0%
 
52 44 2.0%
 
34 44 2.0%
 
24 44 2.0%
 
Other values (50) 1766 79.6%
 
ValueCountFrequency (%) 
1 30 1.4%
 
2 41 1.8%
 
3 39 1.8%
 
4 37 1.7%
 
5 35 1.6%
 
ValueCountFrequency (%) 
60 37 1.7%
 
59 35 1.6%
 
58 32 1.4%
 
57 34 1.5%
 
56 33 1.5%
 

6 Dezena
Real number (ℝ≥0)

HIGH CORRELATION
Distinct count: 60
Unique (%): 2.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 30.22352411
Minimum: 1
Maximum: 60
Zeros: 0
Zeros (%): 0.0%
Memory size: 17.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 4
Q1: 15.5
Median: 30
Q3: 45
95-th percentile: 57
Maximum: 60
Range: 59
Interquartile range (IQR): 29.5

Descriptive statistics

Standard deviation: 17.20366953
Coefficient of variation (CV): 0.5692145452
Kurtosis: -1.196924902
Mean: 30.22352411
Median Absolute Deviation (MAD): 14.87599069
Skewness: 0.01582472468
Sum: 67066
Variance: 295.9662453
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 1. 60.], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
23 52 2.3%
 
33 47 2.1%
 
17 45 2.0%
 
34 44 2.0%
 
10 44 2.0%
 
30 44 2.0%
 
53 43 1.9%
 
22 43 1.9%
 
27 43 1.9%
 
4 42 1.9%
 
Other values (50) 1772 79.9%
 
ValueCountFrequency (%) 
1 32 1.4%
 
2 35 1.6%
 
3 40 1.8%
 
4 42 1.9%
 
5 42 1.9%
 
ValueCountFrequency (%) 
60 30 1.4%
 
59 31 1.4%
 
58 33 1.5%
 
57 40 1.8%
 
56 36 1.6%
 

Arrecadacao_Total
Categorical

HIGH CARDINALITY
Distinct count: 1150
Unique (%): 51.8%
Missing: 0
Missing (%): 0.0%
Memory size: 17.5 KiB

Most frequent values (value: count)
0: 1070
57663666: 1
21700966: 1
29670543: 1
35988757: 1
Other values (1145): 1145
ValueCountFrequency (%) 
0 1070 48.2%
 
57663666 1 < 0.1%
 
21700966 1 < 0.1%
 
29670543 1 < 0.1%
 
35988757 1 < 0.1%
 
34804982 1 < 0.1%
 
61880258 1 < 0.1%
 
45923384,5 1 < 0.1%
 
18310616 1 < 0.1%
 
43268162 1 < 0.1%
 
Other values (1140) 1140 51.4%
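
Arrecadacao_Total and the other monetary columns are profiled as categorical, most likely because values such as 45923384,5 use a decimal comma and were therefore read as text. A minimal sketch of converting such strings, assuming there are no thousands separators (none appear in the values shown in this report); `parse_decimal_comma` is a hypothetical helper name:

```python
from decimal import Decimal


def parse_decimal_comma(text: str) -> Decimal:
    """Convert a string with a decimal comma, e.g. '45923384,5', to Decimal."""
    return Decimal(text.strip().replace(",", "."))


samples = ["0", "57663666", "45923384,5"]
amounts = [parse_decimal_comma(s) for s in samples]

print(amounts[2])    # 45923384.5
print(sum(amounts))  # 103587050.5
```

When loading with pandas, passing `decimal=","` to `read_csv` is the usual way to get the same effect at read time, so these columns would then be profiled as numeric.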
 

Length

Max length: 11
Mean length: 4.935105904
Min length: 1
ValueCountFrequency (%) 
Decimal_Number 10 90.9%
 
Other_Punctuation 1 9.1%
 
ValueCountFrequency (%) 
Common 11 100.0%
 
ValueCountFrequency (%) 
ASCII 11 100.0%
 

Ganhadores_Sena
Real number (ℝ≥0)

SKEWED
ZEROS
Distinct count: 11
Unique (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.3537629563
Minimum: 0
Maximum: 52
Zeros: 1706
Zeros (%): 76.9%
Memory size: 17.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 2
Maximum: 52
Range: 52
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.380871525
Coefficient of variation (CV): 3.903380781
Kurtosis: 897.9657945
Mean: 0.3537629563
Median Absolute Deviation (MAD): 0.5439563798
Skewness: 25.32134224
Sum: 785
Variance: 1.906806167
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 0. 0.5 1.5 2.5 4.5 6.5 52. ], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
0 1706 76.9%
 
1 383 17.3%
 
2 88 4.0%
 
3 23 1.0%
 
4 11 0.5%
 
5 2 0.1%
 
6 2 0.1%
 
17 1 < 0.1%
 
15 1 < 0.1%
 
7 1 < 0.1%
 
ValueCountFrequency (%) 
0 1706 76.9%
 
1 383 17.3%
 
2 88 4.0%
 
3 23 1.0%
 
4 11 0.5%
 
ValueCountFrequency (%) 
52 1 < 0.1%
 
17 1 < 0.1%
 
15 1 < 0.1%
 
7 1 < 0.1%
 
6 2 0.1%
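
Because Ganhadores_Sena has only 11 distinct values, its full frequency table can be assembled from the value counts above (the ten most common values plus the single draw with 52 winners from the maximum list). The sketch below recomputes the summary statistics from that table. It assumes the skewness is the bias-adjusted Fisher-Pearson coefficient, which is the estimator pandas uses, and it computes the mean absolute deviation around the mean for comparison with the reported MAD.

```python
import math

# value -> number of draws, assembled from the frequency tables in this report
counts = {0: 1706, 1: 383, 2: 88, 3: 23, 4: 11, 5: 2, 6: 2, 7: 1, 15: 1, 17: 1, 52: 1}

n = sum(counts.values())
total = sum(v * c for v, c in counts.items())
mean = total / n

# Central moments (population form, dividing by n).
m2 = sum(c * (v - mean) ** 2 for v, c in counts.items()) / n
m3 = sum(c * (v - mean) ** 3 for v, c in counts.items()) / n

sample_variance = m2 * n / (n - 1)
g1 = m3 / m2 ** 1.5                                  # unadjusted skewness
skewness = g1 * math.sqrt(n * (n - 1)) / (n - 2)     # bias-adjusted skewness
mean_abs_dev = sum(c * abs(v - mean) for v, c in counts.items()) / n

print(n, total)
print(round(mean, 6), round(sample_variance, 6))
print(round(skewness, 2), round(mean_abs_dev, 6))
```

A single draw with 52 winners is enough to produce most of the skew: the other 2218 draws all have 17 winners or fewer.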
 

Cidade
Categorical

HIGH CARDINALITY
MISSING
Distinct count: 168
Unique (%): 58.3%
Missing: 1931
Missing (%): 87.0%
Memory size: 17.5 KiB

Most frequent values (value: count)
RIO DE JANEIRO: 22
S�O PAULO: 22
BRAS�LIA: 13
Curitiba: 10
SALVADOR: 9
Other values (163): 212
ValueCountFrequency (%) 
RIO DE JANEIRO 22 1.0%
 
S�O PAULO 22 1.0%
 
BRAS�LIA 13 0.6%
 
Curitiba 10 0.5%
 
SALVADOR 9 0.4%
 
CURITIBA 5 0.2%
 
RECIFE 5 0.2%
 
FORTALEZA 5 0.2%
 
VIT�RIA 4 0.2%
 
PORTO ALEGRE 4 0.2%
 
Other values (158) 189 8.5%
 
(Missing) 1931 87.0%
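
The Cidade counts list "Curitiba" and "CURITIBA" as separate values, so part of the cardinality comes from inconsistent casing. A minimal sketch of merging such variants, assuming upper-casing and whitespace collapsing are acceptable for this column; `normalise_city` is a hypothetical helper and only three counts from the table above are used:

```python
from collections import Counter


def normalise_city(name: str) -> str:
    """Collapse runs of whitespace and upper-case the name."""
    return " ".join(name.split()).upper()


# Counts copied from the frequency table for illustration.
raw_counts = {"Curitiba": 10, "CURITIBA": 5, "RECIFE": 5}

merged = Counter()
for name, count in raw_counts.items():
    merged[normalise_city(name)] += count

print(merged["CURITIBA"])  # 15
print(len(merged))         # 2
```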
 

Length

Max length: 25
Mean length: 3.882830104
Min length: 3
ValueCountFrequency (%) 
Uppercase_Letter 22 57.9%
 
Lowercase_Letter 14 36.8%
 
Other_Symbol 1 2.6%
 
Space_Separator 1 2.6%
 
ValueCountFrequency (%) 
Latin 36 94.7%
 
Common 2 5.3%
 
ValueCountFrequency (%) 
ASCII 37 97.4%
 
Specials 1 2.6%
 

UF
Categorical

MISSING
Distinct count: 27
Unique (%): 5.5%
Missing: 1729
Missing (%): 77.9%
Memory size: 17.5 KiB

Most frequent values (value: count)
SP: 102
RJ: 59
MG: 56
PR: 55
DF: 26
Other values (22): 192
ValueCountFrequency (%) 
SP 102 4.6%
 
RJ 59 2.7%
 
MG 56 2.5%
 
PR 55 2.5%
 
DF 26 1.2%
 
RS 26 1.2%
 
BA 25 1.1%
 
ES 18 0.8%
 
GO 15 0.7%
 
CE 14 0.6%
 
Other values (17) 94 4.2%
 
(Missing) 1729 77.9%
 

Length

Max length: 3
Mean length: 2.779179811
Min length: 2
ValueCountFrequency (%) 
Uppercase_Letter 18 90.0%
 
Lowercase_Letter 2 10.0%
 
ValueCountFrequency (%) 
Latin 20 100.0%
 
ValueCountFrequency (%) 
ASCII 20 100.0%
 

Rateio_Sena
Categorical

HIGH CARDINALITY
Distinct count: 514
Unique (%): 23.2%
Missing: 0
Missing (%): 0.0%
Memory size: 17.5 KiB

Most frequent values (value: count)
0: 1706
40781877,95: 1
50968412,58: 1
7222388,9: 1
51890452,61: 1
Other values (509): 509
ValueCountFrequency (%) 
0 1706 76.9%
 
40781877,95 1 < 0.1%
 
50968412,58 1 < 0.1%
 
7222388,9 1 < 0.1%
 
51890452,61 1 < 0.1%
 
21024430,31 1 < 0.1%
 
21815376,29 1 < 0.1%
 
4792658,35 1 < 0.1%
 
61429947,73 1 < 0.1%
 
19413790,51 1 < 0.1%
 
Other values (504) 504 22.7%
 

Length

Max length: 12
Mean length: 3.187021181
Min length: 1
ValueCountFrequency (%) 
Decimal_Number 10 90.9%
 
Other_Punctuation 1 9.1%
 
ValueCountFrequency (%) 
Common 11 100.0%
 
ValueCountFrequency (%) 
ASCII 11 100.0%
 

Ganhadores_Quina
Real number (ℝ≥0)

HIGH CORRELATION
Distinct count: 334
Unique (%): 15.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 114.2735466
Minimum: 4
Maximum: 7688
Zeros: 0
Zeros (%): 0.0%
Memory size: 17.5 KiB

Quantile statistics

Minimum: 4
5-th percentile: 27
Q1: 49
Median: 77
Q3: 120
95-th percentile: 266.2
Maximum: 7688
Range: 7684
Interquartile range (IQR): 71

Descriptive statistics

Standard deviation: 238.6799765
Coefficient of variation (CV): 2.088672169
Kurtosis: 544.0971593
Mean: 114.2735466
Median Absolute Deviation (MAD): 74.78828163
Skewness: 19.96629572
Sum: 253573
Variance: 56968.13118
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[4.000e+00 1.250e+01 1.950e+01 2.650e+01 8.250e+01 ... 2.835e+02 4.665e+02 9.870e+02 1.613e+03 7.688e+03], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
46 33 1.5%
 
47 29 1.3%
 
43 28 1.3%
 
49 27 1.2%
 
41 27 1.2%
 
66 26 1.2%
 
48 26 1.2%
 
56 25 1.1%
 
74 25 1.1%
 
36 25 1.1%
 
Other values (324) 1948 87.8%
 
ValueCountFrequency (%) 
4 1 < 0.1%
 
6 1 < 0.1%
 
9 1 < 0.1%
 
12 1 < 0.1%
 
13 3 0.1%
 
ValueCountFrequency (%) 
7688 1 < 0.1%
 
4862 1 < 0.1%
 
3001 1 < 0.1%
 
2581 1 < 0.1%
 
1665 1 < 0.1%
 

Rateio_Quina
Categorical

HIGH CARDINALITY
UNIFORM
UNIQUE
Distinct count: 2219
Unique (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 17.5 KiB

Most frequent values (value: count)
33116,63: 1
7995,63: 1
43085,53: 1
15170,39: 1
41286,03: 1
Other values (2214): 2214
ValueCountFrequency (%) 
33116,63 1 < 0.1%
 
7995,63 1 < 0.1%
 
43085,53 1 < 0.1%
 
15170,39 1 < 0.1%
 
41286,03 1 < 0.1%
 
20304,76 1 < 0.1%
 
7904,4 1 < 0.1%
 
16865,13 1 < 0.1%
 
8993,04 1 < 0.1%
 
9839,13 1 < 0.1%
 
Other values (2209) 2209 99.5%
 

Length

Max length: 9
Mean length: 7.761153673
Min length: 4
ValueCountFrequency (%) 
Decimal_Number 10 90.9%
 
Other_Punctuation 1 9.1%
 
ValueCountFrequency (%) 
Common 11 100.0%
 
ValueCountFrequency (%) 
ASCII 11 100.0%
 

Ganhadores_Quadra
Real number (ℝ≥0)

HIGH CORRELATION
Distinct count: 2005
Unique (%): 90.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7298.292925
Minimum: 683
Maximum: 303857
Zeros: 0
Zeros (%): 0.0%
Memory size: 17.5 KiB

Quantile statistics

Minimum: 683
5-th percentile: 2311.6
Q1: 3807.5
Median: 5453
Q3: 7842.5
95-th percentile: 15658.3
Maximum: 303857
Range: 303174
Interquartile range (IQR): 4035

Descriptive statistics

Standard deviation: 10801.77676
Coefficient of variation (CV): 1.480041549
Kurtosis: 320.2279672
Mean: 7298.292925
Median Absolute Deviation (MAD): 3973.157272
Skewness: 14.78411613
Sum: 16194912
Variance: 116678381.2
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 683. 1438. 1938. 2618. 5885.5 ... 14211.5 22168.5 31252. 60107. 303857. ], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
5886 5 0.2%
 
2892 3 0.1%
 
1865 3 0.1%
 
5957 3 0.1%
 
5172 3 0.1%
 
2088 3 0.1%
 
3828 3 0.1%
 
2722 3 0.1%
 
3770 3 0.1%
 
3752 3 0.1%
 
Other values (1995) 2187 98.6%
 
ValueCountFrequency (%) 
683 1 < 0.1%
 
689 1 < 0.1%
 
965 1 < 0.1%
 
1051 1 < 0.1%
 
1173 1 < 0.1%
 
ValueCountFrequency (%) 
303857 1 < 0.1%
 
173428 1 < 0.1%
 
168546 1 < 0.1%
 
124889 1 < 0.1%
 
113258 1 < 0.1%
 

Rateio_Quadra
Categorical

HIGH CARDINALITY
UNIFORM
Distinct count: 2174
Unique (%): 98.0%
Missing: 0
Missing (%): 0.0%
Memory size: 17.5 KiB

Most frequent values (value: count)
159,45: 3
354,7: 3
188,18: 2
203,03: 2
296,08: 2
Other values (2169): 2207
ValueCountFrequency (%) 
159,45 3 0.1%
 
354,7 3 0.1%
 
188,18 2 0.1%
 
203,03 2 0.1%
 
296,08 2 0.1%
 
231,66 2 0.1%
 
197,31 2 0.1%
 
142,48 2 0.1%
 
233,71 2 0.1%
 
250,27 2 0.1%
 
Other values (2164) 2197 99.0%
 

Length

Max length: 7
Mean length: 5.875619648
Min length: 3
ValueCountFrequency (%) 
Decimal_Number 10 90.9%
 
Other_Punctuation 1 9.1%
 
ValueCountFrequency (%) 
Common 11 100.0%
 
ValueCountFrequency (%) 
ASCII 11 100.0%
 

Acumulado
Categorical

Distinct count: 2
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 17.5 KiB

Most frequent values (value: count)
SIM: 1706
N�O: 513
ValueCountFrequency (%) 
SIM 1706 76.9%
 
N�O 513 23.1%
 

Length

Max length: 3
Mean length: 3
Min length: 3
ValueCountFrequency (%) 
Uppercase_Letter 5 83.3%
 
Other_Symbol 1 16.7%
 
ValueCountFrequency (%) 
Latin 5 83.3%
 
Common 1 16.7%
 
ValueCountFrequency (%) 
ASCII 5 83.3%
 
Specials 1 16.7%
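
The character tables for this variable report one Other_Symbol from the Specials block, which is consistent with the profiled data itself containing the Unicode replacement character (U+FFFD) where an accented letter once stood. One plausible cause, offered here as an assumption rather than a confirmed diagnosis, is that the source file was saved in a single-byte encoding such as Latin-1 and then decoded as UTF-8 with invalid bytes replaced. The sketch below reproduces that effect; `misdecode` is a hypothetical helper.

```python
import unicodedata

REPLACEMENT = "\ufffd"


def misdecode(text: str) -> str:
    """Encode as Latin-1, then decode as UTF-8, replacing invalid bytes."""
    return text.encode("latin-1").decode("utf-8", errors="replace")


damaged = misdecode("NÃO")

print(damaged == "N" + REPLACEMENT + "O")  # True
print(unicodedata.category(REPLACEMENT))   # So
print(unicodedata.name(REPLACEMENT))       # REPLACEMENT CHARACTER
```

If that is what happened, re-reading the original file with the correct encoding (for example `encoding="latin-1"` in `pandas.read_csv`) would restore the accented letters. The replacement character itself cannot be converted back, because the original byte is no longer present.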
 

Valor_Acumulado
Categorical

HIGH CARDINALITY
Distinct count: 1707
Unique (%): 76.9%
Missing: 0
Missing (%): 0.0%
Memory size: 17.5 KiB

Most frequent values (value: count)
0: 513
2116308,12: 1
9795115,03: 1
1780905,8: 1
9522864,81: 1
Other values (1702): 1702
ValueCountFrequency (%) 
0 513 23.1%
 
2116308,12 1 < 0.1%
 
9795115,03 1 < 0.1%
 
1780905,8 1 < 0.1%
 
9522864,81 1 < 0.1%
 
18200523,7 1 < 0.1%
 
13993542,04 1 < 0.1%
 
12817492,36 1 < 0.1%
 
2418938,4 1 < 0.1%
 
851497,47 1 < 0.1%
 
Other values (1697) 1697 76.5%
 

Length

Max length: 12
Mean length: 8.245606129
Min length: 1
ValueCountFrequency (%) 
Decimal_Number 10 90.9%
 
Other_Punctuation 1 9.1%
 
ValueCountFrequency (%) 
Common 11 100.0%
 
ValueCountFrequency (%) 
ASCII 11 100.0%
 

Estimativa_Pr�mio
Real number (ℝ≥0)

ZEROS
Distinct count: 157
Unique (%): 7.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13937945.02
Minimum: 0
Maximum: 300000000
Zeros: 860
Zeros (%): 38.8%
Memory size: 17.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 3000000
Q3: 20000000
95-th percentile: 50000000
Maximum: 300000000
Range: 300000000
Interquartile range (IQR): 20000000

Descriptive statistics

Standard deviation: 26750745.29
Coefficient of variation (CV): 1.919274703
Kurtosis: 36.218671
Mean: 13937945.02
Median Absolute Deviation (MAD): 16029056.77
Skewness: 4.957342869
Sum: 3.09283e+10
Variance: 7.156023738e+14
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[0.000e+00 5.000e+05 1.150e+06 1.400e+06 1.550e+06 ... 4.050e+07 5.250e+07 7.600e+07 1.375e+08 3.000e+08], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
0 860 38.8%
 
3000000 105 4.7%
 
2500000 75 3.4%
 
2000000 70 3.2%
 
5000000 44 2.0%
 
30000000 39 1.8%
 
25000000 36 1.6%
 
6000000 36 1.6%
 
7000000 34 1.5%
 
20000000 31 1.4%
 
Other values (147) 889 40.1%
 
ValueCountFrequency (%) 
0 860 38.8%
 
1000000 2 0.1%
 
1100000 1 < 0.1%
 
1200000 7 0.3%
 
1300000 6 0.3%
 
ValueCountFrequency (%) 
300000000 1 < 0.1%
 
280000000 3 0.1%
 
275000000 1 < 0.1%
 
240000000 1 < 0.1%
 
230000000 1 < 0.1%
 

Acumulado_Mega_da_Virada
Categorical

HIGH CARDINALITY
Distinct count: 1217
Unique (%): 54.8%
Missing: 0
Missing (%): 0.0%
Memory size: 17.5 KiB

Most frequent values (value: count)
0: 1003
1889008,61: 1
23691363,83: 1
15445992,99: 1
35081085,93: 1
Other values (1212): 1212
ValueCountFrequency (%) 
0 1003 45.2%
 
1889008,61 1 < 0.1%
 
23691363,83 1 < 0.1%
 
15445992,99 1 < 0.1%
 
35081085,93 1 < 0.1%
 
19840265,97 1 < 0.1%
 
5970146,17 1 < 0.1%
 
26320721,89 1 < 0.1%
 
86444,79 1 < 0.1%
 
53804975,33 1 < 0.1%
 
Other values (1207) 1207 54.4%
 

Length

Max length: 12
Mean length: 6.301036503
Min length: 1
ValueCountFrequency (%) 
Decimal_Number 10 90.9%
 
Other_Punctuation 1 9.1%
 
ValueCountFrequency (%) 
Common 11 100.0%
 
ValueCountFrequency (%) 
ASCII 11 100.0%
 

D1
Real number (ℝ≥0)

HIGH CORRELATION
Distinct count: 60
Unique (%): 2.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 30.56106354
Minimum: 1
Maximum: 60
Zeros: 0
Zeros (%): 0.0%
Memory size: 17.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 4
Q1: 16
Median: 31
Q3: 46
95-th percentile: 57
Maximum: 60
Range: 59
Interquartile range (IQR): 30

Descriptive statistics

Standard deviation: 17.31535952
Coefficient of variation (CV): 0.5665823605
Kurtosis: -1.197267312
Mean: 30.56106354
Median Absolute Deviation (MAD): 14.94274467
Skewness: -0.02374603654
Sum: 67815
Variance: 299.8216753
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 1. 1.5 59.5 60. ], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
28 53 2.4%
 
4 49 2.2%
 
49 47 2.1%
 
30 46 2.1%
 
47 46 2.1%
 
59 44 2.0%
 
35 43 1.9%
 
32 43 1.9%
 
44 43 1.9%
 
27 43 1.9%
 
Other values (50) 1762 79.4%
 
ValueCountFrequency (%) 
1 34 1.5%
 
2 43 1.9%
 
3 27 1.2%
 
4 49 2.2%
 
5 38 1.7%
 
ValueCountFrequency (%) 
60 36 1.6%
 
59 44 2.0%
 
58 26 1.2%
 
57 31 1.4%
 
56 39 1.8%
 

D2
Real number (ℝ≥0)

HIGH CORRELATION
Distinct count: 60
Unique (%): 2.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 30.08292023
Minimum: 1
Maximum: 60
Zeros: 0
Zeros (%): 0.0%
Memory size: 17.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 3
Q1: 15
Median: 30
Q3: 45
95-th percentile: 57
Maximum: 60
Range: 59
Interquartile range (IQR): 30

Descriptive statistics

Standard deviation: 17.40237153
Coefficient of variation (CV): 0.5784801276
Kurtosis: -1.237111547
Mean: 30.08292023
Median Absolute Deviation (MAD): 15.15949172
Skewness: 0.002090724933
Sum: 66754
Variance: 302.842535
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 1. 1.5 60. ], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
5 55 2.5%
 
32 49 2.2%
 
17 48 2.2%
 
21 47 2.1%
 
53 47 2.1%
 
10 47 2.1%
 
42 46 2.1%
 
8 45 2.0%
 
33 45 2.0%
 
11 44 2.0%
 
Other values (50) 1746 78.7%
 
ValueCountFrequency (%) 
1 40 1.8%
 
2 37 1.7%
 
3 41 1.8%
 
4 31 1.4%
 
5 55 2.5%
 
ValueCountFrequency (%) 
60 25 1.1%
 
59 35 1.6%
 
58 37 1.7%
 
57 30 1.4%
 
56 42 1.9%
 

D3
Real number (ℝ≥0)

HIGH CORRELATION
Distinct count: 60
Unique (%): 2.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 30.79134745
Minimum: 1
Maximum: 60
Zeros: 0
Zeros (%): 0.0%
Memory size: 17.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 4
Q1: 16
Median: 31
Q3: 46
95-th percentile: 58
Maximum: 60
Range: 59
Interquartile range (IQR): 30

Descriptive statistics

Standard deviation: 17.36442257
Coefficient of variation (CV): 0.5639383791
Kurtosis: -1.189359197
Mean: 30.79134745
Median Absolute Deviation (MAD): 14.99273451
Skewness: -0.0105751407
Sum: 68326
Variance: 301.5231713
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 1. 1.5 60. ], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
18 52 2.3%
 
27 52 2.3%
 
54 48 2.2%
 
58 47 2.1%
 
56 47 2.1%
 
4 46 2.1%
 
24 46 2.1%
 
29 45 2.0%
 
37 44 2.0%
 
34 44 2.0%
 
Other values (50) 1748 78.8%
 
ValueCountFrequency (%) 
1 41 1.8%
 
2 38 1.7%
 
3 31 1.4%
 
4 46 2.1%
 
5 32 1.4%
 
ValueCountFrequency (%) 
60 31 1.4%
 
59 41 1.8%
 
58 47 2.1%
 
57 36 1.6%
 
56 47 2.1%
 

D4
Real number (ℝ≥0)

HIGH CORRELATION
Distinct count: 60
Unique (%): 2.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 30.81478143
Minimum: 1
Maximum: 60
Zeros: 0
Zeros (%): 0.0%
Memory size: 17.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 4
Q1: 16
Median: 31
Q3: 45
95-th percentile: 58
Maximum: 60
Range: 59
Interquartile range (IQR): 29

Descriptive statistics

Standard deviation: 17.27227036
Coefficient of variation (CV): 0.5605189963
Kurtosis: -1.182374478
Mean: 30.81478143
Median Absolute Deviation (MAD): 14.9065153
Skewness: -0.02897220929
Sum: 68378
Variance: 298.3313233
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 1. 1.5 59.5 60. ], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
37 54 2.4%
 
36 48 2.2%
 
54 46 2.1%
 
29 45 2.0%
 
60 45 2.0%
 
18 44 2.0%
 
1 44 2.0%
 
43 44 2.0%
 
53 43 1.9%
 
5 43 1.9%
 
Other values (50) 1763 79.5%
 
ValueCountFrequency (%) 
1 44 2.0%
 
2 32 1.4%
 
3 24 1.1%
 
4 35 1.6%
 
5 43 1.9%
 
ValueCountFrequency (%) 
60 45 2.0%
 
59 34 1.5%
 
58 40 1.8%
 
57 33 1.5%
 
56 31 1.4%
 

D5
Real number (ℝ≥0)

HIGH CORRELATION
Distinct count: 60
Unique (%): 2.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 30.42361424
Minimum: 1
Maximum: 60
Zeros: 0
Zeros (%): 0.0%
Memory size: 17.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 4
Q1: 15
Median: 31
Q3: 45
95-th percentile: 57
Maximum: 60
Range: 59
Interquartile range (IQR): 30

Descriptive statistics

Standard deviation: 17.18679273
Coefficient of variation (CV): 0.564916206
Kurtosis: -1.198977134
Mean: 30.42361424
Median Absolute Deviation (MAD): 14.89063825
Skewness: -0.01804537533
Sum: 67510
Variance: 295.3858443
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 1. 59.5 60. ], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
35 48 2.2%
 
28 47 2.1%
 
16 46 2.1%
 
44 45 2.0%
 
10 45 2.0%
 
45 45 2.0%
 
12 45 2.0%
 
52 44 2.0%
 
34 44 2.0%
 
24 44 2.0%
 
Other values (50) 1766 79.6%
 
ValueCountFrequency (%) 
1 30 1.4%
 
2 41 1.8%
 
3 39 1.8%
 
4 37 1.7%
 
5 35 1.6%
 
ValueCountFrequency (%) 
60 37 1.7%
 
59 35 1.6%
 
58 32 1.4%
 
57 34 1.5%
 
56 33 1.5%
 

D6
Real number (ℝ≥0)

HIGH CORRELATION
Distinct count: 60
Unique (%): 2.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 30.22352411
Minimum: 1
Maximum: 60
Zeros: 0
Zeros (%): 0.0%
Memory size: 17.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 4
Q1: 15.5
Median: 30
Q3: 45
95-th percentile: 57
Maximum: 60
Range: 59
Interquartile range (IQR): 29.5

Descriptive statistics

Standard deviation: 17.20366953
Coefficient of variation (CV): 0.5692145452
Kurtosis: -1.196924902
Mean: 30.22352411
Median Absolute Deviation (MAD): 14.87599069
Skewness: 0.01582472468
Sum: 67066
Variance: 295.9662453
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[ 1. 60.], "bayesian blocks" binning strategy used)
ValueCountFrequency (%) 
23 52 2.3%
 
33 47 2.1%
 
17 45 2.0%
 
34 44 2.0%
 
10 44 2.0%
 
30 44 2.0%
 
53 43 1.9%
 
22 43 1.9%
 
27 43 1.9%
 
4 42 1.9%
 
Other values (50) 1772 79.9%
 
ValueCountFrequency (%) 
1 32 1.4%
 
2 35 1.6%
 
3 40 1.8%
 
4 42 1.9%
 
5 42 1.9%
 
ValueCountFrequency (%) 
60 30 1.4%
 
59 31 1.4%
 
58 33 1.5%
 
57 40 1.8%
 
56 36 1.6%
 

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for a linear relationship the slope of the line, that is its angle to the x-axis, does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
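
The calculation described above can be sketched in plain Python. This is an illustrative implementation written for this note, not the code the profiling tool runs; the function name `pearson_r` and the sample values are made up. The 1/n (or 1/(n-1)) factors cancel between the covariance and the standard deviations, so sums of deviations are used directly.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


x = [1, 2, 3, 4]
y = [1, 3, 2, 5]

r = pearson_r(x, y)
# Shifting and positively rescaling y leaves r unchanged (up to rounding).
r_rescaled = pearson_r(x, [3 * v + 7 for v in y])

print(round(r, 4), round(r_rescaled, 4))
print(pearson_r(x, [2 * v for v in x]))   # 1.0
print(pearson_r(x, [-v for v in x]))      # -1.0
```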

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
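
As a sketch of that recipe: rank each variable, then apply the Pearson formula to the ranks. The helpers below are illustrative and self-contained. Giving tied values the average of their positions is an assumption about tie handling (it is the common convention), and the names `average_ranks` and `spearman_rho` are made up for this note.

```python
import math


def average_ranks(values):
    """Rank values from 1, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Covariance divided by the product of standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Pearson correlation of the rank variables."""
    return pearson_r(average_ranks(x), average_ranks(y))


x = [-2, -1, 0, 1, 2]
cubes = [v ** 3 for v in x]   # monotonic but not linear in x

print(average_ranks([10, 20, 20, 30]))  # [1.0, 2.5, 2.5, 4.0]
print(round(spearman_rho(x, cubes), 6), round(pearson_r(x, cubes), 6))
```

On the cubic example ρ is 1 while r is below 1, which is the sense in which ρ captures monotonic relationships that are not linear.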

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is then the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
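
The pair-counting definition can be sketched directly. This is the simple form without tie corrections (often called tau-a); library implementations commonly default to a tie-corrected variant, so results may differ when the data contain ties. The function name `kendall_tau_a` and the sample data are illustrative.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total pairs, with no correction for ties."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1      # both variables move the same way
        elif direction < 0:
            discordant += 1      # the variables move in opposite ways
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


x = [1, 2, 3, 4]
y = [1, 3, 2, 4]   # one swapped pair out of six

tau = kendall_tau_a(x, y)
print(round(tau, 4))  # 0.6667
```

Here five of the six pairs are concordant and one is discordant, giving (5 - 1) / 6.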

Missing values

Sample

First rows

Concurso | Data Sorteio | 1 Dezena | 2 Dezena | 3 Dezena | 4 Dezena | 5 Dezena | 6 Dezena | Arrecadacao_Total | Ganhadores_Sena | Cidade | UF | Rateio_Sena | Ganhadores_Quina | Rateio_Quina | Ganhadores_Quadra | Rateio_Quadra | Acumulado | Valor_Acumulado | Estimativa_Pr�mio | Acumulado_Mega_da_Virada | D1 | D2 | D3 | D4 | D5 | D6
0111/03/1996415452303300NaNNaN01739158,922016330,21SIM1714650,23004154523033
1218/03/19969393749434101NaNPR2307162,236514424,024488208,91N�O00093937494341
2325/03/199636301011294702NaNRN391192,516210515,934261153,01N�O000363010112947
3401/04/199665942271500NaNNaN03915322,243311180,48SIM717080,7500659422715
4508/04/199611946616200NaNNaN0985318,1539996,53SIM1342488,8500119466162
5615/04/19961940713224700NaNNaN01097214,667147110,03SIM2286166,330019407132247
6722/04/1996563821203500NaNNaN01008746,055736152,48SIM3335692,28005638212035
7829/04/19965317384473700NaNNaN06016084,115262183,4SIM4493748,190053173844737
8906/05/19965543565486000NaNNaN01760043,792175469,31SIM5718641,490055435654860
91013/05/19962541857213800NaNNaN02516638,4512590132,35SIM13334769,810025418572138

Last rows

Concurso | Data Sorteio | 1 Dezena | 2 Dezena | 3 Dezena | 4 Dezena | 5 Dezena | 6 Dezena | Arrecadacao_Total | Ganhadores_Sena | Cidade | UF | Rateio_Sena | Ganhadores_Quina | Rateio_Quina | Ganhadores_Quadra | Rateio_Quadra | Acumulado | Valor_Acumulado | Estimativa_Pr�mio | Acumulado_Mega_da_Virada | D1 | D2 | D3 | D4 | D5 | D6
2209221023/11/201911333425241751380914,50NaNNaN08136572,745910716,07SIM32024497,143800000087374944,04113334252417
2210221127/11/201927443654414053416021,50NaNNaN05259225,5336811195,22SIM37697679,244400000088185398,63274436544140
2211221230/11/2019515853522623598473900NaNNaN06156566,095215945,22SIM44053920,915000000089093433,17515853522623
2212221304/12/20197510603246665239501S�O GON�ALORJ51119263,339639952,837360744,46N�O0300000090102767,87510603246
2213221407/12/201918304103447315186300NaNNaN04441300,513223805,47SIM20884230,042500000090580984,2318304103447
2214221511/12/201921234311933442966500NaNNaN04853207,24092891,61SIM25588866,493100000091253075,1821234311933
2215221614/12/201924424810434952692673,50NaNNaN04962000,464361995,19SIM31185223,613600000092052554,77244248104349
2216221717/12/2019163236301410364417380NaNNaN05141197,383258921,27SIM35055609,493900000092605467,06163236301410
2217221819/12/20191663822524847115841,51FRANCASP40059665,224264678,2833451160,14N�O0250000093320332,1816638225248
2218221921/12/201957594582836279980550NaNNaN02662086,2222351031,79SIM110640988,95300000000110640988,9557594582836